On Accounting for Sequence-Specific Bias in Genome-Wide Chromatin Accessibility Experiments: Recent Advances and Contradictions
Abstract
Uncovering the protein–DNA interactions involved in cell fate, development, and disease in a time- and cell-specific manner is a fundamental goal of molecular biology. The advent of sequencing technologies has opened a new genomic era, uncovering the information encoded in genomes, epigenomes, and transcriptomes (McPherson, 2014). For example, the popular ChIP-based techniques ChIP-seq (Johnson et al., 2007; Robertson et al., 2007) and ChIP-exo (Rhee and Pugh, 2011) are widely used to detect transcription factor (TF)-binding sites using an antibody against a single protein of interest (Mahony and Pugh, 2015). Alternative protocols assaying the chromatin landscape, such as those based on digestion by the DNase I enzyme (DNase-seq), micrococcal nuclease (MNase-seq), and Tn5 transposase attack (ATAC-seq), enable the identification of DNA-binding protein footprints of many TFs in a single experiment (Tsompana and Buck, 2014). Time-series experiments might be required to identify those TFs cataloged as pioneer factors, allowing their effects on chromatin to be investigated (Zaret and Carroll, 2011; Pajoro et al., 2014; Sherwood et al., 2014). Despite the initial promise of detecting the majority of TFs in one assay, DNA sequence-specific biases, together with TF-dependent binding kinetics, have recently been pinpointed as major confounding factors in DNase-seq experiments (Koohy et al., 2013; He et al., 2014; Raj and McVicker, 2014; Rusk, 2014; Sung et al., 2014). These factors were not considered by any of the previous computational approaches for the analysis of next-generation sequencing chromatin accessibility data (Madrigal and Krajewski, 2012), whether based on a TF-generic DNase signature or on a TF-specific DNase signature (Luo and Hartemink, 2013).
Similar resources
I-40: Male Genome Programming, Infertility and Cancer
Background: During male germ cell differentiation, genome-wide re-organizations and highly specific programming of the male genome occur. These changes not only include the large-scale meiotic shuffling of genes, taking place in spermatocytes, but also a complete “re-packaging” of the male genome in postmeiotic cells, leading to a highly compacted nucleo-protamine structure in the mature sperm...
Assaying the epigenome in limited numbers of cells.
Spectacular advances in the throughput of DNA sequencing have allowed genome-wide analysis of epigenetic features such as methylation, nucleosome position and post-translational modification, chromatin accessibility and connectivity, and transcription factor binding. However, for rare or precious biological samples, input requirements of many of these methods limit their application. In this re...
A synergistic DNA logic predicts genome-wide chromatin accessibility.
Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence features can predict chromatin accessibility at high spatial resolution. We develop a new type of high-dimensional machine learning model, the Synergistic Chromatin Model (SC...
ATAC-seq: A Method for Assaying Chromatin Accessibility Genome-Wide.
This unit describes Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq), a method for mapping chromatin accessibility genome-wide. This method probes DNA accessibility with hyperactive Tn5 transposase, which inserts sequencing adapters into accessible regions of chromatin. Sequencing reads can then be used to infer regions of increased accessibility, as well as...
Interplay between chromatin state, regulator binding, and regulatory motifs in six human cell types.
The regions bound by sequence-specific transcription factors can be highly variable across different cell types despite the static nature of the underlying genome sequence. This has been partly attributed to changes in chromatin accessibility, but a systematic picture has been hindered by the lack of large-scale data sets. Here, we use 456 binding experiments for 119 regulators and 84 chromatin...